Abstract: We present the first explicit, and currently simplest,
randomized algorithm for two-process wait-free test-and-set. It is
implemented with two 4-valued single-writer single-reader atomic
variables. A test-and-set takes at most 11 expected elementary
steps, while a reset takes exactly 1 elementary step. Based on a
finite-state analysis, the proofs of correctness and expected
length are compressed into one table.

Keywords: Test-and-set objects, Symmetry breaking, Asynchronous
distributed protocols, Fault-tolerance, Shared memory, Wait-free
read/write registers, Atomicity, Randomized algorithms, Adaptive
adversary.



I. Introduction 

A test-and-set protocol concurrently executed by each process out
of a subset of n processes selects a unique process from among
them. In a distributed or concurrent system, the test-and-set
operation is useful and sometimes mandatory in a variety of
situations including mutual exclusion, resource allocation, leader
election and choice coordination. It is well-known that in the
wait-free setting, a deterministic construction from atomic
read/write variables is impossible [?]. Although widely assumed to
exist, and referred to, an explicit randomized construction for
wait-free test-and-set has not appeared in print yet, apart from a
deterministic construction assuming two-process atomic
test-and-set. The latter, in the form of a randomized two-process
wait-free test-and-set, has been circulated in draft form for a
decade. Here we finally present the construction. Since such
constructions are notoriously prone to hard-to-detect errors, we
prove it correct by an exhaustive finite-state proof, thus also
presenting a nontrivial application of this proof technique.

Interprocess Communication: The model is interprocess
communication through shared memory as commonly used in the theory
of distributed algorithms [?]. We use atomic single-writer
single-reader registers as primitives. Such primitives can be
implemented wait-free from single-reader single-writer "safe" bits
(mathematical versions of hardware "flip-flops") [?]. A concurrent
object is constructible if it can be implemented deterministically
with boundedly many safe bits. A deterministic protocol executed by
n processes is wait-free if there is a finite function f such that
every non-faulty process terminates its protocol after executing at
most f(n) accesses to the shared memory primitives, regardless of
the other processes' execution speeds. If the execution speed of a
process drops to zero then this is indistinguishable from the
process having a crash failure. As a consequence, a wait-free
solution can tolerate up to n - 1 processes having crash failures
(a property called "(n - 1)-resiliency"), since the surviving
non-faulty process correctly executes and terminates its protocol.
Below, we also write "shared variable" for "register."

Randomization: The algorithms executed by each pro- 
cess are randomized by having the process flip coins (ac- 
cess a random number generator). In our randomized al- 
gorithms the answers are always correct — a unique pro- 
cess gets selected — but with small probability the proto- 
col takes a long time to finish. We use the customary as- 
sumption that the coin flip and subsequent write to shared 
memory are separate atomic actions. To express the com- 
putational complexity of our algorithm we use the expected 
complexity, over all system executions and with respect 
to the randomization by the processes and the worst-case 
scheduling strategy of an adaptive adversary. A randomized
protocol is wait-free if f(n) upper bounds the expectation of the
number of elementary steps, where the expectation is taken over all
randomized system executions against the worst-case adversary in
the class of adversaries considered (in our results the adaptive
adversaries).

Complexity Measures: The computational complex- 
ity of distributed deterministic algorithms using shared 
memory is commonly expressed in number and type of in- 
tercommunication primitives required and the maximum 
number of sequential read/writes by any single process in 
a system execution. Local computation is usually ignored, 
including coin-flipping in a randomized algorithm. 

Related Work: What concurrent wait-free object is the most powerful
constructible one? It has been shown that wait-free atomic
multi-user variables, and atomic snapshot objects, are
constructible, for example [?]. In contrast, the agreement problem
in the deterministic model of computation (shared memory or message
passing) is unsolvable in the presence of faults [?].
Correspondingly, wait-free consensus, viewed as an object on which
each of n processes can execute just one operation, is not
constructible, although randomized implementations are possible
[?]. Wait-free concurrent test-and-set can deterministically
implement two-process wait-free consensus, and therefore is not
deterministically constructible [?]. This raises the question of
whether randomized algorithms for test-and-set exist.

In [?] it is shown that repeated use of 'consensus' on unbounded
hardware can implement 'test-and-set'. In [?] it is argued that a
bounded solution can be obtained by combining several intermediate
constructions, like so-called "sticky bits", but no explicit
construction is presented to back up this claim. To quote [?]:
"randomized consensus algorithms of Chor, Israeli, and Li [?],
Abrahamson, Aspnes and Herlihy, and Attiya, Dolev, and Shavit [?]
together with our construction imply that polynomial number of safe
bits is sufficient to convert a safe implementation into a
(randomized) wait-free one." Any such "layered" construction will
require orders of magnitude more primitive building blocks like
one-writer one-reader bits than the direct construction we present
below. Wait-free n-process test-and-set can be implemented
deterministically from wait-free two-process test-and-set, showing
that the impossibility of a deterministic algorithm for n-process
test-and-set is solely due to the two-process case.

Present Results: Despite the frequent use of randomized wait-free
test-and-set in the literature, no explicit construction for the
basic ingredient, randomized wait-free two-process test-and-set,
has appeared in print. Our construction [?] has been subsumed and
referred to long since, for example in [?], but other interests
prevented us from publishing a final version earlier. The
construction is optimal or close to optimal. The presented
algorithm directly implements wait-free test-and-set between two
processes from single-writer single-reader atomic shared registers.
Randomization means that the algorithm contains a branch
conditioned on the outcome of a fair coin flip (as in [?]). We use
a finite-state based proof technique for verifying correctness and
worst-case expected execution length in the spirit of [?]. Our
construction is very simple: it uses two 4-valued 1-writer 1-reader
atomic variables. The worst-case expected number of elementary
steps (called "accesses" in the remainder of the paper) in a
test-and-set operation is 11, whereas a reset always takes 1
access.

II. Preliminaries 

Processes are sequentially executed finite programs with bounded
local variables communicating through single-writer, multi-reader
bounded wait-free atomic registers (shared variables). The latter
are a common model for interprocess communication through shared
memory as discussed briefly in Section I. For details see [?], and
for use and motivation in distributed protocols see [?].

A. Shared Registers, Atomicity 

The basic building blocks of our construction are 4- 
valued 1-writer 1-reader atomic registers. Every read/write 
register is owned by one process. Only the owner of a reg- 
ister can write it, while only one other process can read it. 
In one access a process can either: 

• Read the value of a register; 

• Write a value to one of its own registers; 

• Moreover, following the read/write of a register the pro- 
cess possibly flips a local coin (invokes a random number 
generator that returns a random bit), preceded or followed 
by some local computation. 



We require the system to be atomic: every access of a process can
be thought to take place in an indivisible instant of time, and in
every indivisible time instant at most one access by one process is
executed. The atomicity requirement induces in each actual system
execution total orders on the set of all of the accesses by the
different processes, on the set of accesses of every individual
process, and on the set of read/write operations executed on each
individual register. The state of the system gives for each
process: the contents of the program counter, the contents of the
local variables, and the contents of the owned shared registers.
Since processes execute sequential programs, in each state every
process has at most a single access to be executed next. Such
accesses are enabled in that state.

B. Adversary 

There is an adversarial scheduling demon that in each 
state decides which enabled access is executed next, and 
thus determines the sequence of accesses of the system 
execution. There are two main types of adversaries: the 
oblivious adversary that uses a fixed schedule independent 
of the system execution, and the much stronger adaptive 
adversary that dynamically adapts the schedule based on 
the past initial segment of the system execution. Our re- 
sults hold against the adaptive adversary — the strongest 
adversary possible. 

C. Complexity 

The computational complexity of a randomized distributed algorithm
in an adversarial setting and the corresponding notion of
wait-freeness require careful definitions. For the rigorous novel
formulation of adversaries as restricted measures over the set of
system executions we refer to the Appendix of [?]. For the simple
application in this paper we can assume that the notions of global
(system) execution, wait-freeness, adaptive adversary, and expected
complexity are familiar. A randomized distributed algorithm is
wait-free if the expected number of read/writes to shared memory by
every participating process is bounded by a finite function f(n),
where n is the number of processes. The expectation is taken over
the probability measure over all randomized global (system)
executions against the worst-case adaptive adversary.

III. Test-and-Set Implementation 

We first specify the semantics of the target object:

Definition III.1: An atomic test-and-set object $X$ is a global
variable associated with $n$ processes $P_0, \ldots, P_{n-1}$,
exhibiting the following functionality:

• The value of $X$ is 0 or 1;

• Every process $P_i$ has a local binary variable $x_i$ which it
alone can read or write;

• At any time exactly one of $X, x_0, \ldots, x_{n-1}$ has value 0,
all others have value 1 (we assume the global time model);

• A process $P_i$ with $x_i = 1$ can atomically execute a
test-and-set operation $\tau$:
read $x_i := X$; write $X := 1$; return $x_i$.

• A process $P_i$ with $x_i = 0$ can atomically execute a reset
operation $\rho$:
$x_i := 1$; write $X := 0$.

This specification naturally leads to the definition of the state
of the test-and-set object as an element of
$\{\bot, 0, \ldots, n-1\}$ corresponding to the unique local
variable out of $X, x_0, \ldots, x_{n-1}$ that has value 0. Here
$\bot$ is the state that none of the $x_i$'s is 0. Formally, the
specification is given later as a finite automaton in
Definition V.1.
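The semantics of Definition III.1 can be sketched as a small sequential object in which every method call stands for one indivisible operation. This is an illustration only: the class name `AtomicTestAndSet` and its method names are hypothetical and do not appear in the paper, and the initial state is taken to be $\bot$ (that is, $X = 0$).

```python
class AtomicTestAndSet:
    """Sequential model of the target object for n processes (a sketch)."""

    def __init__(self, n):
        self.X = 0                # global bit; initially X holds the unique 0
        self.x = [1] * n          # local bits x_0 .. x_{n-1}

    def test_and_set(self, i):
        assert self.x[i] == 1, "only a process with x_i = 1 may test-and-set"
        self.x[i] = self.X        # read x_i := X
        self.X = 1                # write X := 1
        return self.x[i]

    def reset(self, i):
        assert self.x[i] == 0, "only the owner of the 0 may reset"
        self.x[i] = 1
        self.X = 0

    def state(self):
        """Return None for the state 'bottom' (X holds the 0), else the owner."""
        bits = [self.X] + self.x
        assert bits.count(0) == 1     # exactly one of X, x_0..x_{n-1} is 0
        return None if self.X == 0 else self.x.index(0)
```

For example, with three processes, a first `test_and_set(1)` returns 0 and moves the object to state 1; any other process then obtains 1 until process 1 calls `reset(1)`.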



Since "atomicity" means that the operation is executed in a single
indivisible time instant, and, moreover, in every such time instant
at most one operation execution takes place, the effect of a
test-and-set operation by process $P_i$ is that $x_i := 0$ iff
$x_j \neq 0$ for all $j \neq i$, and $x_i = 1$ otherwise. The
effect of a reset operation by $P_i$ is only defined for initially
$x_i = 0$ and $x_j \neq 0$ for all $j \neq i$, and results in
$x_i := 1$. To synthesise the target object from more elementary
objects, we have to use a sequence of atomic accesses to these
elementary objects. By adversary scheduling these sequences may be
interleaved arbitrarily. Yet we would like to have the effect of an
atomic execution of the test-and-set operations and the reset
operations by each process. To achieve such a "virtual" atomic
execution we proceed as follows:

Definition III.2: An implementation of a test-and-set operation
$\tau$ or a reset operation $\rho$ by a process $P$ is an algorithm
executed by $P$ that results in an ordered sequence of accesses of
that process to elements of a set $\{R_0, \ldots, R_{m-1}\}$ of
atomic shared variables, interspersed with local computation and/or
local coin flips. The sequence of accesses is determined by the,
possibly randomized, algorithm, and the values returned by the
"read" accesses to shared variables. We denote an access by
$(P, R, A)$, meaning that process $P$ executes access $A$ (read or
write a "0" or "1") on shared variable $R$. The implementation must
satisfy the specification of the target test-and-set semantics of
Definition III.1 restricted to process $P$. Formally, the
specification is given later as a finite automaton in
Definition V.2.



Definition III.3: A local execution of a process $P$ consists of
the (possibly infinite) sequence of test-and-set operations and
reset operations it executes, according to the implementation, each
such operation $a \in \{\tau, \rho\}$ provided with a start time
$s(a)$ and a finish time $f(a)$ (we assume a global time model).
Note that $s(a)$ coincides with the time of execution of the first
access in the ordered sequence constituting $a$, and $f(a)$
coincides with the time of execution of the last access in the
ordered sequence constituting $a$. By the atomicity of the
individual accesses in the global time model, all accesses are
executed at different time instants. In certain cases (which we
show to have zero probability) it is possible that $f(a)$ is not
finite (because the algorithm executes infinitely many loops with
probability $\frac{1}{2}$ each).

Definition III.4: Let the local execution of process $P_i$ consist
of the ordered sequence of operations $a_1^i, a_2^i, \ldots$
($0 \le i \le n-1$). A global execution consists of the pair
$(A, \to)$, where
$A = \{a_j^i : j = 1, 2, \ldots,\ 0 \le i \le n-1\}$ and $\to$ is a
partial order on the elements of $A$ defined by $a \to b$ iff
$f(a) < s(b)$ (the last access of $a$ precedes the first of $b$).
We require that the number of $b$ such that $b \to a$ is finite for
each $a$.

A test-and-set operation or reset operation by a particu- 
lar process may consist of more than one access, and there- 
fore the local executions by the different processes may hap- 
pen concurrently and asynchronously. This has the effect 
that a global execution can correspond to many different 
interleavings. 

Definition III.5: Consider a global execution. An interleaving of
the accesses by the different processes associated with the global
execution is a (possibly infinite) totally ordered sequence
$(P^1, R^1, A^1), (P^2, R^2, A^2), \ldots$, where
$(P^i, R^i, A^i)$ is the $i$th access, respecting

• the start times and finish times determined by the local
executions; and

• the order of the accesses in the local executions.
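To see how one pair of local executions can give rise to many interleavings, the following sketch enumerates all merges of two access sequences that preserve each process's own order. It covers only the second condition of Definition III.5 (local order) and ignores start and finish times; the function name `interleavings` and the access labels are made up for this example.

```python
def interleavings(a, b):
    """Yield every merge of lists a and b that keeps each list's own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    # The next access is either the first of a or the first of b.
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


# Two accesses per process, written as (process, register, access) triples.
p0 = [("P0", "R0", "w(me)"), ("P0", "R1", "r")]
p1 = [("P1", "R1", "w(me)"), ("P1", "R0", "r")]
all_orders = list(interleavings(p0, p1))
```

Two sequences of two accesses each already admit six interleavings, which is why the correctness proof has to reason about all of them rather than a single schedule.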

The implementation should guarantee that the functionality of the
implementation is "equivalent", in an appropriate sense, to the
functionality of the target test-and-set object, and in particular
satisfies the "linearizability requirement" [?] (also called
"atomicity" in [?]).

Definition III.6: The system implements the target test-and-set
object if the system is initially in state $\bot$, and we can
extend $\to$ on $A$ to a total order $\Rightarrow$ on $A$ with an
initial element, satisfying:

• From state $\bot$, a successful test-and-set operation $\tau$
executed by process $P_i$ (setting $x_i := 0$) moves the system to
state $i$ at some time instant in the interval
$[s(\tau), f(\tau)]$;

• From state $i$, a reset operation $\rho$ executed by process
$P_i$ moves the system to state $\bot$ at some time instant in the
interval $[s(\rho), f(\rho)]$;

• From state $i$, every operation execution different from a reset
by process $P_i$ leaves the system invariant in state $i$; and

• No other state transitions than the above are allowed.

The implementation must satisfy the specification of the target
test-and-set semantics of Definition III.1. Formally, the
specification is given later as a finite automaton in
Definitions V.3 and V.4.



To prove that a protocol executed by all processes is an
implementation of the target test-and-set object it suffices to
show that every possible interleaving that can be produced by the
processes executing the protocol in every global execution,
starting from the $\bot$ state, satisfies the above requirements.

IV. Algorithm 

We give a test-and-set implementation between two processes,
process $P_0$ and process $P_1$. The construction uses two 4-valued
shared read/write variables $R_0$ and $R_1$. The four values are
'me', 'he', 'choose', 'rst', chosen as a mnemonic aid explained
below. Process $P_i$ solely writes variable $R_i$, its own
variable, and solely reads $R_{1-i}$. For this reason the reads and
writes in the protocol don't need to be qualified by the shared
variables they access. The protocol, for process $P_i$
($i = 0, 1$), is presented both as a finite state chart, Figure 1,
and as the program below. The state chart representation will
simplify the analysis later. The transitions in the state chart are
labeled with reads $r(value)$ and writes $w(value)$ of the shared
variables, where $value$ denotes the value read or written. The 11
states of the state chart are split into 4 groups enclosed by
dotted lines. Each group is an equivalence class consisting of the
set of states in which process $P_i$'s own shared variable $R_i$
has the same value. That is, the states in a group are equivalent
in the sense that process $P_{1-i}$ cannot distinguish between them
by reading $R_i$. Accordingly, the inter-group transitions are
writes to $R_i$, whereas the intra-group transitions are reads of
$R_{1-i}$. Each group is named after the corresponding value of the
own shared variable $R_i$. The state chart is deterministic, but
for a coin flip which is modeled by the two inter-group transitions
in the "choose" group, representing the two outcomes of a fair coin
flip. Doubly circled states are "idle" states (no operation
execution is in progress), and singly circled states are
intermediate states in an operation execution that is in progress.

A program representation of the protocol, for process $P_i$, is
given below. An occurrence of $R_i$ not preceded by 'write'
(similarly, of $R_{1-i}$ not preceded by 'read') as usual refers to
the last value written to it (resp. read from it). The conditional
'rnd(true,false)' represents the boolean outcome 'true' or 'false'
of a fair coin flip. The system is initialized with value 'rst' in
shared variables $R_0, R_1$. In our protocol, all assignments to
local variables consist of contents read from shared variables. To
simplify, we abbreviate statements like
"$r_{1-i} := R_{1-i}$; while $r_{1-i} = r_i$ do ... $r_{1-i} := R_{1-i}$"
to "while read $R_{1-i} = R_i$ do ...". Here, $r_i$ is the local
variable containing the value last written to shared variable $R_i$
and $r_{1-i}$ is the local variable storing the last read value of
shared variable $R_{1-i}$, for process $P_i$. This way, our (writing
of the) protocol can dispense with local variables altogether.

test_and_set:
    if R_i = he AND read R_{1-i} ≠ rst
        then return 1
    write R_i := me
    while read R_{1-i} = R_i do
        write R_i := choose
        if read R_{1-i} = he OR
           (R_{1-i} = choose AND rnd(true,false))
            then write R_i := me
            else write R_i := he
    if R_i = me
        then return 0
        else return 1

reset:
    write R_i := rst
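The program above can be exercised with a small simulation. The sketch below is our own illustrative rendering, not the authors' code: each process is a Python generator that pauses before every shared-register access, and a uniformly random scheduler (far weaker than the adaptive adversary of the analysis) picks which process takes the next access. The names `test_and_set`, `simulate`, `holds`, and `wins` are hypothetical, and the coin-flip bias of 0.5 models the fair coin.

```python
import random

RST, ME, HE, CHOOSE = "rst", "me", "he", "choose"

def test_and_set(R, i, rng):
    """Generator for P_i: each `yield` hands control to the scheduler just
    before one shared-register access; the return value is 0 or 1."""
    j = 1 - i
    if R[i] == HE:                 # own register: no shared access needed
        yield                      # access: read R[1-i]
        if R[j] != RST:
            return 1
    yield                          # access: write R[i] := me
    R[i] = ME
    while True:
        yield                      # access: read R[1-i]
        if R[j] != R[i]:
            break
        yield                      # access: write R[i] := choose
        R[i] = CHOOSE
        yield                      # access: read R[1-i], then local coin flip
        seen = R[j]
        go_me = seen == HE or (seen == CHOOSE and rng.random() < 0.5)
        yield                      # access: write R[i] := me or he
        R[i] = ME if go_me else HE
    return 0 if R[i] == ME else 1

def simulate(seed, steps=5000):
    """Run both processes under a random scheduler and return how often a
    test-and-set returned 0. Raises AssertionError if both hold the 0."""
    rng = random.Random(seed)
    R = [RST, RST]                 # both registers start as 'rst'
    op = [None, None]              # test-and-set in progress, if any
    holds = [False, False]         # holds[i]: P_i obtained 0, not yet reset
    wins = 0
    for _ in range(steps):
        i = rng.randrange(2)
        if holds[i]:
            if rng.random() < 0.5: # keep the 0 for a while, then reset
                R[i] = RST         # reset is a single write
                holds[i] = False
            continue
        if op[i] is None:
            op[i] = test_and_set(R, i, rng)
        try:
            next(op[i])
        except StopIteration as done:
            op[i] = None
            if done.value == 0:
                assert not holds[1 - i], "both processes hold the 0"
                holds[i] = True
                wins += 1
    return wins
```

Calling `simulate(seed)` for several seeds checks one safety property only: no process obtains the 0 while the other still holds it. Such random testing is no substitute for the exhaustive finite-state proof given in Section V.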



It can be verified in the usual way that the state chart represents
the operation of the program. The intuition is easily explained
using the state chart. The default situation is where both
processes are idle, which corresponds to being in the 'rst' state.
If process $P_i$ starts a test-and-set then it writes
$R_i := \text{me}$ (indicating its desire to take the 0), and
checks by reading $R_{1-i}$ whether process $P_{1-i}$ agrees (by
not having $R_{1-i} = \text{me}$). If so, then $P_i$ has
successfully completed a test-and-set by obtaining the 0 and,
implicitly, setting the global variable $X := 1$. In this case
process $P_{1-i}$ cannot get 0 until process $P_i$ does a reset by
writing $R_i := \text{rst}$. While $R_i = \text{me}$, process
$P_{1-i}$ can only move from state 'me' to state 'notme' and on via
states 'choose', 'tohe' and 'he' to 'tst1', where it completes its
test-and-set operation by failure to obtain the 0.

The only complication arises if both processes see each other's
variable equal to 'me'. In this case they are said to disagree or
to be in conflict. They then proceed to the 'choose' state from
where they decide between going for 0 or 1, according to what the
other process is seen to be doing. (It is essential that this
decision be made in a neutral state, without a claim of preference
for either 0 or 1. If, for example, on seeing a conflict, a process
would change preference at random, then a process cannot know for
sure whether the other one agrees or is about to write a changed
preference.)

The deterministic choices, those made if the other's variable is
read to contain a value different from 'choose', can be seen to
lead to a correct resolution of the conflict. A process ending up
in the 'tst1' state makes sure that its test-and-set resulting in
obtaining the 1 is justified, by remaining in that state until it
can be sure that the other process has taken the 0. Only if the
other process is seen to be in the 'rst' state does it resume
trying to take the 0 itself.

Suppose now that process $P_i$ has read $R_{1-i} = \text{choose}$
and is about to flip a coin. Assume that process $P_{1-i}$ has
already moved to one of the states 'tome'/'tohe' (or else reason
with the processes interchanged). With 50 percent chance, process
$P_i$ will move to the opposite state as did process $P_{1-i}$, and
thus the conflict will be resolved.



In the proof of Theorem V.13 (below) we establish that the
probability of each loop through the 'choose' state is at most one
half, and the expected number of 'choices' (transitions from state
'choose') is at most two. This indicates that the worst case
expected test-and-set length is 11. Namely, starting from the
'tst1' state, it takes 4 accesses to get to state 'choose', another
4 accesses to loop back to 'choose' and 3 more accesses to reach
'tst0'/'tst1'. The reset operation always takes 1 access.
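The count of 11 can be checked by a short calculation. The reading used here is our interpretation of the paragraph above, not a statement taken from the proof of Theorem V.13: an operation that makes $k$ choices uses at most $4 + 4(k-1) + 3$ accesses, and each pass through 'choose' is followed by another pass with probability at most $\frac{1}{2}$, so that $\Pr[k \ge m] \le (\frac{1}{2})^{m-1}$.

```latex
\begin{align*}
\mathrm{E}[k] &= \sum_{m \ge 1} \Pr[k \ge m]
              \;\le\; \sum_{m \ge 1} \left(\tfrac{1}{2}\right)^{m-1} = 2, \\
\mathrm{E}[\text{accesses}] &\le 4 + 4\,(\mathrm{E}[k] - 1) + 3
              \;\le\; 4 + 4 + 3 = 11.
\end{align*}
```

The first 4 accesses are the read of 'rst', the write of 'me', the read of 'me' and the write of 'choose'; each further loop costs a read, a write, a read and a write of 'choose'; the final 3 are a read, a write and the read that ends the loop.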

V. Proof of Correctness 

The proof idea is as follows: We give a specification of a correct
implementation of two-process test-and-set in the form of a finite
automaton (Figure 4). We then show that all initial segments of
every possible interleaving of accesses by two processes $P_0$ and
$P_1$, both executing the algorithm of the state chart (Figure 1),
are accepted by the finite automaton. Moreover, the sequence of
states of the finite automaton in the acceptance process induces a
linear order on the operation execution of the implemented
processes that extends the partial order induced by the start and
finish times of the individual operation executions. Thus, the
implementation is both correct and atomic. Essentially, the proof
is given by Figure 5 which gives the state of the specification
finite automaton for every reachable combination of states which
processes $P_0$ and $P_1$ can attain in their respective copies of
the state chart (Figure 1). By analysis of the state chart, or
Figure 5, we upper bound the expectation of the number of accesses
of every operation execution of the implementation by a small
constant. Hence the implementation is wait-free.

Let $h$ be an interleaving corresponding to a global execution
$(A, \to)$ of two processes running the protocol starting from the
initial state. Let $\{s(a), f(a) : a \in A\}$ be the set of time
instants that start or finish an operation execution, each such
time instant corresponding to an access $(P, R, A)$. Let $B$ denote
the set of these accesses. Recall that if $a$ is a reset, then we
have $s(a) = f(a)$ and there is but a single access executing this
operation.

By definition, $h|B$, the restriction of $h$ to the accesses in
$B$, completely determines the partial order $\to$. If, for every
$a \in A$ we can choose a single access $(P, R, A)_a$ in the
sequence of accesses constituting the operation execution of $a$,
such that if $a \to b$ then $(P, R, A)_a$ precedes $(P, R, A)_b$ in
$h$, then we are done. Namely, we can imagine an operation $a$ as
executing atomically at the time instant of atomic access
$(P, R, A)_a$, and the total order $\Rightarrow$ defined by
$a \Rightarrow b$ iff $(P, R, A)_a$ precedes $(P, R, A)_b$ in $h$,
extends the partial order $\to$. Denote the set
$\{(P, R, A)_a : a \in A\}$ by $C$. We have to show that for every
$h$ as defined above such a $C$ can be found.

Definition V.1: Specification of two-process atomic test-and-set:
The definition of the target atomic test-and-set for two processes,
process $P_0$ and process $P_1$, is captured by finite automaton
FA1 in Figure 2, which accepts all possible sequences of atomic
test-and-set and reset operations (all states final). The states
are labeled with the owner of the 0-bit. The arcs representing
actions of process $P_1$ are labeled, whereas the non-labeled arcs
represent the corresponding actions of process $P_0$: resulting in
setting $x_i := 1$.

Definition V.2: Specification of wait-free atomic test-and-set
restricted to a single process: Figure 3 shows the semantics
required of a correct implementation of a wait-free test-and-set
object as a finite automaton FA2 (all states final) that accepts
all sequences of accesses by a single process $P_i$ ($i = 0, 1$)
executing a correct wait-free atomic test-and-set protocol. The
accesses are:

• the access starting a test-and-set operation execution, denoted
s(tas),

• the atomic occurrence of a test-and-set operation execution
returning 0, denoted tas0,

• the atomic occurrence of a test-and-set operation execution
returning 1, denoted tas1,

• the access finishing a test-and-set operation execution
returning 0, denoted f(tas0),

• the access finishing a test-and-set operation execution
returning 1, denoted f(tas1),

• the single access corresponding to a complete reset operation
execution, denoted rst.

Fig. 2. FA1: Specification of two-process atomic test-and-set
object

These are the events in $B \cup C$ restricted to a process $P_i$.
The reason for not splitting a reset operation execution 
into start, atomic occurrence, and finish is that it is im- 
plemented in our protocol as a single atomic write where 
the above three transitions coincide. As before, doubly cir- 
cled states are "idle" states (no operation execution is in 
progress), and singly circled states are intermediate states 
in an operation execution that is in progress. 




Fig. 3. FA2: Specification of 1-process wait-free implementation of
atomic test-and-set


Definition V.3: Specification of two-process wait-free atomic
test-and-set: The proof that our implementation is correct consists
in demonstrating that it satisfies the specification in the form of
the finite automaton FA3 in Figure 4 below (again all states are
final).

Formally FA3 is the composition of FA1 with two copies of FA2, in
the I/O Automata framework, as follows: It is drawn as a cartesian
product of the two component processes; transitions of process
$P_0$ are drawn vertically and those of process $P_1$ horizontally.
For clarity, the transition names are only given once: only for
process $P_1$. Identifying the starts and finishes of test-and-set
operation executions $a$ with their atomic occurrence
$(P, R, A)_a$ by collapsing the $s()$ and $f()$ arcs, FA3 reduces
to the atomic test-and-set diagram FA1. Identifying all nodes in
the same column (row) reduces FA3 to FA2 of process $P_0$ (process
$P_1$).

In the states labeled 'a' through 'h', neither process owns the 0;
the system is in state $\bot$. In the states labeled 'i' through
'n', process 1 owns the 0; the system is in state 1. In the states
labeled 'o' through 't', process 0 owns the 0; and the system is in
state 0.

Fig. 4. FA3/FA4: Specification of two-process wait-free atomic
test-and-set

The broken transitions of Figure 4 correspond to the access
$(P, R, A)_a \in C$, required for a correct implementation, where
the atomic execution of operation $a$ can be virtually situated.
Recall that this is only relevant when $a$ is a test-and-set
operation, since the reset operation is implemented in the protocol
already in a single atomic access of a shared primitive variable.

Definition V.4: Let FA4 be the (nondeterministic) finite automaton
obtained from FA3 by turning the broken transitions of Figure 4,
which correspond to the unknown but existing access
$(P, R, A)_a \in C$ where the execution of $a$ can be virtually
situated, into $\epsilon$-moves.

Lemma V.5: Acceptance of $h|B$ by FA4 implies that $(A, \to)$ is
linearizable: partial order $\to$ can be extended to a total order
$\Rightarrow$ such that the sequence of operation executions in $A$
ordered by $\Rightarrow$ satisfies the test-and-set semantics
specification of Definition V.1.

Proof: If FA4 accepts $h|B$, then, corresponding to the
$\epsilon$-moves, we can augment the sequence $h|B$ with an access
$(P, R, A)_a$ in the interval $[s(a), f(a)]$ of each operation
execution $a \in A$ (or select the single access involved if
$s(a) = f(a)$, as in the case of a reset operation execution) to
obtain a new sequence $h'$ that is accepted by FA3. Since FA1 is a
component of FA3, FA1 accepts $h'|C$, the subsequence of atomic
accesses $(P, R, A)_a$ with $a \in A$ contained in $h'$.
Furthermore, letting $t(a)$ denote the time of access
$(P, R, A)_a$, we have $a \to b$ iff
$t(a) \le f(a) < s(b) \le t(b)$. Defining $a \Rightarrow b$ if
$t(a) < t(b)$, the total order of accesses in $h'|C$, then
$\Rightarrow$ is a total order that extends the partial order
$\to$. That is, the sequence of operation executions of $A$,
linearly ordered by $\Rightarrow$, is accepted by FA1. ∎
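The ordering argument in this proof can be illustrated in a few lines. The sketch below assumes each operation execution is given as a record with start time $s$, finish time $f$, and a chosen access time $t$ in $[s, f]$; the names `Op`, `precedes`, `linearize`, and `extends_partial_order` are hypothetical. It shows only that sorting by $t$ extends $\to$, not that the resulting sequence is accepted by FA1.

```python
from collections import namedtuple

# One operation execution: start s, finish f, and a chosen access time t.
Op = namedtuple("Op", "name s f t")

def precedes(a, b):
    """The partial order: a -> b iff f(a) < s(b)."""
    return a.f < b.s

def linearize(ops):
    """Order operations by their chosen access times t(a), after checking
    that every t(a) lies in [s(a), f(a)] and that all t(a) are distinct."""
    assert all(a.s <= a.t <= a.f for a in ops)
    assert len({a.t for a in ops}) == len(ops)
    return sorted(ops, key=lambda a: a.t)

def extends_partial_order(order):
    """True if a -> b implies that a comes before b in the given order."""
    pos = {a.name: k for k, a in enumerate(order)}
    return all(pos[a.name] < pos[b.name]
               for a in order for b in order if precedes(a, b))

# Two overlapping test-and-sets followed by a reset (a single access).
ops = [Op("tas_P0", 1, 6, 5), Op("tas_P1", 2, 4, 3), Op("rst_P1", 7, 7, 7)]
order = linearize(ops)
```

Here the two test-and-set operations overlap, so $\to$ leaves them unordered and the chosen access times decide that `tas_P1` comes first; the reset starts after both have finished and therefore follows them in any valid total order.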

Recall that Figure 1 is the state chart of the execution of the
implementation of an operation by a single process. Each process
can be in a particular state of the state chart. Let $(s_0, s_1)$
denote the state of the system with process $P_i$ in state $s_i$
($i \in \{0, 1\}$).

Definition V.6: The initial system state is (rst, rst). A system
state $(s_0, s_1)$ is reachable from the initial system state
(rst, rst) if there is a sequence $h$ arising from the execution of
our test-and-set implementation, represented by the state chart of
Figure 1, starting from the initial state and ending in state
$(s_0, s_1)$.

Example V.7: In the initial state both processes are in state
'rst'. Process $P_0$ can start a test-and-set by executing $w(me)$
and entering state me. Suppose process $P_1$ now starts a
test-and-set: it executes $w(me)$ and moves to state me. Hence,
system states (me, rst) and (me, me) are reachable states.

Definition V.8: The representative set of a reachable system state
$(s_0, s_1)$ is a nonempty set $S_{s_0,s_1}$ of FA3/FA4 states, as
in Figure 4, such that: For every sequence of accesses $h$ starting
in the initial state and ending in state $(s_0, s_1)$, the set
$S_{s_0,s_1}$ is the set of states in which FA4 can be after
processing $h|B$, excluding those states that have outgoing moves
that are $\epsilon$-moves only.

Example V.9: We elaborate Example V.7. In the initial
state both processes are in state 'rst'. The corresponding
start state d of FA4 gives the associated (in this case
singleton) representative set {d}. When process P0 executes
w(me) and enters state me, the resulting system state is
(me, rst) with the associated representative set {g, p} of
FA4 states. That is, the system is now either in state g,
meaning that process P0 has executed s(tas), or in state p,
meaning that process P0 has executed s(tas) and also tas0
atomically. In the scenario of Example V.7, process P1 now
executes w(me) and moves to state me, resulting in the
system state (me, me). The corresponding representative set
of FA4 states is {i, m, o, q}. State m says process P1 has
executed s(tas) and tas0 atomically, while process P0 has
only executed s(tas); hence the system was previously in
state g and not in state p. State i says process P1 has
executed s(tas) and tas0 atomically, while process P0 has
executed s(tas) and tas1 atomically, and hence the system
was previously in state g and not state p. States o and
q imply the same state of affairs with the roles of process P0
and process P1 interchanged, and the previous system state
is either p or g. (The correspondence between reachable
states and their representative sets is exhaustively
established in Claim V.11 below.) ◊

Lemma V.10: Let h be a sequence of accesses arising
from the execution of our test-and-set implementation,
represented by the state chart of Figure 1, starting from the
initial state (both processes in state 'rst'). Then every
initial segment of h|B is accepted by FA4 starting from initial
state 'd'.

Proof: We show that the set of letters in an entry in
the table of Figure 5 is a representative set for the state of
process P0, indexing the row, and the state of process P1,
indexing the column. The entries were chosen excluding all
states from the representative sets with all outgoing moves
consisting of ε-moves (but the representative sets contain
the states the outgoing ε-moves of the excluded states point
to). This gives the most insight into the workings of the
protocol by considering only the result of executing ε-moves
from a state if its only outgoing moves are ε-moves. A
*-entry indicates an unreachable state pair. (The number
ending an entry gives the expected number of accesses to
finish the current operation execution of process P0 and,
by symmetry, that for an equivalent state pair with respect
to P1. We will use this later.) Thus, every state (s0, s1)
of the implementation execution corresponds with a set of
states S_{s0,s1} of FA4.

Claim V.11: The representative sets are given by the
entries of Figure 5.

Proof: The proof of the claim is contained in the
combination of Figures 1, 4, and 5. Below we give the inductive
argument. The mechanical verification of the subcases has
been done by hand, and again by machine. Setting up the
exhaustive list of subcases and subsequently verifying it
by a computer program is the essence of a finite-state
proof. In this particular case, exceptionally, the finite state
machines involved (and the table of representative sets)
have been minimized so that "mechanical" verification by
hand by the reader is still feasible. Induction is on the
length of the sequence of accesses:

Base Case: Initially, after an empty sequence of accesses,
FA4 is in the state {d} = S_{rst,rst}.

Induction: Every non-reachable state has a *-entry in the
table of Figure 5. Consider an arbitrary atomic transition
from a reachable state (s0, s1) to a state (t0, t1), that is,
using a single arc in the state chart in Figure 1 for either
process P0 or P1. This way, either t0 = s0 or t1 = s1 but
not both. Then, for every FA4 state y ∈ S_{t0,t1}, Figure 4,
according to the table of Figure 5, there is an FA4 state
x ∈ S_{s0,s1} according to Figure 4 such that FA4 can move
from x to y by executing either the access corresponding
to the transition in the state chart in Figure 1, if that access
belongs to B, or no access otherwise (there is a sequence
of ε-moves from x to y). ■

Since every reachable state of the system (s0, s1), with si
(i ∈ {0, 1}) a state of the state chart of Figure 1, has a
representative set in FA4, Figure 4, and every state of FA4
is an accepting state, the lemma follows from Claim V.11.



[Figure caption: Table verification of correctness and wait-freedom]

Theorem V.12: The algorithm represented by the state
chart of Figure 1 correctly implements an atomic test-and-set
object.

Proof: By Lemma V.10 the implementation by the
state chart in Figure 1 correctly implements the specification
of two-process test-and-set given by Figure 4. The
implementation is linearizable (atomic) by Lemma V.10. The
system makes progress (every operation execution is executed
completely except for possibly the last one of each
process) since h|B contains only the start and finish
accesses of each operation execution performed by the
implementation. ■
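
The inductive step in the proof of Claim V.11 is a finite check and can be phrased as a short program. The following is a minimal sketch under stated assumptions, not the verification program used for the paper: the function names, the encoding of the automaton as dictionaries, and the toy data (states `a0` to `a3`, process states `u`, `v`, `w`, access `start0`) are all invented for illustration and do not reproduce FA4 or the table of representative sets. We also assume that ε-moves may be taken both before and after the access.

```python
def eps_closure(state, eps_moves):
    """All automaton states reachable from `state` by zero or more eps-moves."""
    seen, stack = {state}, [state]
    while stack:
        for nxt in eps_moves.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def can_move(x, y, access, moves, eps_moves):
    """True if the automaton can go from x to y reading `access` (None: nothing)."""
    sources = eps_closure(x, eps_moves)
    if access is None:
        return y in sources
    return any(
        y in eps_closure(mid, eps_moves)
        for s in sources
        for mid in moves.get((s, access), ())
    )


def check_inductive_step(transitions, rep_sets, moves, eps_moves):
    """Return all (source, target, y) for which no suitable x exists."""
    failures = []
    for src, access, dst in transitions:
        for y in rep_sets[dst]:
            if not any(can_move(x, y, access, moves, eps_moves)
                       for x in rep_sets[src]):
                failures.append((src, dst, y))
    return failures


# Hypothetical toy instance (NOT the paper's automaton or table).
MOVES = {("a0", "start0"): ["a1"]}
EPS = {"a1": ["a2"]}
REP = {("u", "u"): {"a0"}, ("v", "u"): {"a1", "a2"}, ("w", "u"): {"a2"}}
TRANSITIONS = [
    (("u", "u"), "start0", ("v", "u")),  # access belongs to B
    (("v", "u"), None, ("w", "u")),      # access not in B: eps-moves only
]
failures = check_inductive_step(TRANSITIONS, REP, MOVES, EPS)

# A deliberately wrong table entry: a3 is not reachable from a0.
REP_BAD = dict(REP)
REP_BAD[("v", "u")] = {"a1", "a3"}
bad_failures = check_inductive_step(TRANSITIONS, REP_BAD, MOVES, EPS)
```

An empty list of failures corresponds to the inductive step holding for every listed transition; a wrong table entry shows up as a reported triple.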

Theorem V.13: The algorithm represented by the state
chart of Figure 1 is wait-free: the expected number of
accesses to shared variables never exceeds 11 during execution
of an operation.

Proof: In Figure 1 every arc is an access. Double-circled
states are idle states (in between completing an operation
execution and starting a new one). Consider process
P0 (the case for process P1 is symmetrical). The longest
path without completing an operation and without cycling
is from state 'tst1': tst1, free, me, notme, choose, tohe,
he, tst1. This takes 7 accesses. Four of these accesses are
parts of a potential cycle of length 4. The remainder is 3
accesses outside the potential cycle. In state 'choose', the
outgoing arrow is a random choice only when process P1
is also in the CHOOSE group. If it is, then with probability
1/2 P1 makes (or has already made) a choice which will
cause process P0 to loop back to the 'choose' state again.
This can happen again and again. The expected number of
iterations of loops is $\sum_{i=1}^{\infty} i \, (1/2)^i = 2$.
Since a loop has length 4, this gives a total of expected accesses of
8 for the loops. Together with 3 non-loop accesses the total
is at most 11 accesses. Such a computation holds for every
state in the state chart of Figure 1, the only loop being
the one discussed, but the longest possible path is the one
starting from 'tst1'. For definiteness, we have in fact
computed the expected number of accesses for every accessible
state (s0, s1) according to the state chart of Figure 1, and
added that number to the representative set concerned in
the table of Figure 5. Since the expected number of
accesses is between 1 and 11 for all operation executions, the
algorithm given by the state chart of Figure 1 is wait-free.

■ 
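
The arithmetic in the proof of Theorem V.13 can be checked numerically. The sketch below (the names `partial_sum`, `LOOP_LENGTH` and `NON_LOOP` are ours) evaluates partial sums of the series for the expected number of loop iterations with exact rational arithmetic, and then combines the limit 2 with the loop length 4 and the 3 non-loop accesses.

```python
from fractions import Fraction


def partial_sum(n_terms):
    """Exact partial sum of sum_{i=1}^{n} i * (1/2)**i."""
    half = Fraction(1, 2)
    return sum(i * half**i for i in range(1, n_terms + 1))


LOOP_LENGTH = 4  # accesses per pass through the potential cycle
NON_LOOP = 3     # accesses on the longest path outside the cycle

expected_iterations = 2  # limit of the series as n_terms grows
bound = LOOP_LENGTH * expected_iterations + NON_LOOP

print(float(partial_sum(10)), float(partial_sum(60)))
print(bound)
```

The partial sums equal 2 − (n + 2)/2^n, so they increase towards 2 from below, which gives 4 · 2 + 3 = 11 expected accesses.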

To aid intuition, we give an example of checking a few 
transitions below, as well as giving the interpretation. 

Example V.14: We elaborate and continue Examples V.7
and V.9. In the initial state both processes are in
state 'rst'. In Figure 5, the table entry d10 gives the
corresponding start state d of FA4. The worst-case expected
number of accesses for a test-and-set by a process is 10.
Process P0 can start a test-and-set by executing w(me)
and entering state me. The corresponding table entry gp9
indicates in Figure 4 that the system is now either in state
g, meaning that process P0 has executed s(tas), or in state
p, meaning that process P0 has executed s(tas) and also
tas0 atomically. The expected number of accesses is now
9 ≤ 10 − 1. Suppose process P1 now starts a test-and-set: it
executes w(me) and moves to state me. The corresponding
table entry imoq9 gives the system state as one possibility
in {i, m, o, q} in Figure 4, and the expected number of
accesses for execution of test-and-set by process P0 is still
9. State m says process P1 has executed s(tas) and tas0
atomically, while process P0 has only executed s(tas);
hence the system was previously in state g and not in state
p. State i says process P1 has executed s(tas) and tas0
atomically, while process P0 has executed s(tas) and tas1
atomically, and hence the system was previously in state
g and not state p. States o and q imply the same state
of affairs with the roles of process P0 and process P1
interchanged, and the previous system state is either p or
g.

Note that at this point the system can also be in state
h of FA4: both processes having executed s(tas) but no
process having executed tas0 or tas1. However, from h
there are two ε-moves possible, and no other moves, leading
to q and m. This corresponds to the fact that if both
processes have executed s(tas), one of them must return 0
and the other one must return 1. We have optimized
the table entries by eliminating such spurious intermediate
states h with outgoing moves that are ε-moves only.

Process P0 might now read R1 = me, and move via state
'notme' (table entry imoq8) by writing R0 := choose, to
state 'choose'. Process P1 is idle in the meantime. The
table entry is now i3. This says that process P1 has
atomically executed tas0, and process P0 has atomically executed
tas1. Namely, all subsequent schedules lead in 3 accesses
of process P0 to state 'tst1', hence the expectation 3.

The expected number of remaining accesses of process
P0's test-and-set has dropped from 8 to 3 by the
last access, since 8 was the worst case which could be
forced by the adversary. Namely, from the system in state
(notme, me), the adversary can schedule process P1 to
move to (notme, notme) with table entry imoq8, followed
by a move of process P1 to state (notme, choose) with
table entry imoq8, followed by a move of process P0 to state
(choose, choose) with table entry imoq7. Suppose the
adversary now schedules process P0. It now flips a fair coin to
obtain the conditional boolean rnd(true, false). If the
outcome is true, then the system moves to state (tome, choose)
with entry imoq6. If the outcome is false, then the
system moves to state (tohe, choose) with table entry imoq6.
Given a fair coin, this access of process P0 correctly
decrements the expected number of accesses. Suppose the
adversary schedules process P1 in state (choose, choose).
Process P1 flips a fair coin. If the outcome is true the
system moves to state (choose, tome) with table entry imoq7;
if the outcome is false then the system moves to state
(choose, tohe) with table entry imoq7. ◊
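
The bookkeeping of expected accesses along the adversarial schedule of Example V.14 can be replayed mechanically. The sketch below is our own illustration, with the hypothetical names `SCHEDULE` and `consistent`; the numbers are the final digits of the table entries quoted in the example, read as 8, 8, 7 and 6, 6. It assumes the reading that an access by P0 costs one step, an access by P1 costs P0 nothing, and coin outcomes are equiprobable. Equality is checked at each step because this schedule is the worst case; on other schedules the value after a move of P1 could be smaller.

```python
from fractions import Fraction

# Start in (notme, me), where P0's worst-case expected remaining accesses is 8.
START_VALUE = 8

# (who moves, expected remaining accesses of P0 in each possible successor)
SCHEDULE = [
    ("P1", [8]),     # -> (notme, notme)
    ("P1", [8]),     # -> (notme, choose)
    ("P0", [7]),     # -> (choose, choose)
    ("P0", [6, 6]),  # fair coin: -> (tome, choose) or (tohe, choose)
]


def consistent(start, schedule):
    """Check value_before == cost_of_step + average(value_after) at each step."""
    current = Fraction(start)
    for mover, outcomes in schedule:
        average = Fraction(sum(outcomes), len(outcomes))  # equiprobable outcomes
        cost = 1 if mover == "P0" else 0  # only P0's own accesses are counted
        if current != cost + average:
            return False
        current = average
    return True


print(consistent(START_VALUE, SCHEDULE))
```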

VI. Remark on Multi-Process Test-And-Set 

The obvious way to extend the given solution to more 
than two processes would be to arrange them at the leaves 
of a binary tree. Then, a process wishing to execute an n- 
process test-and-set, would enter a tournament, as in p9| , 
by executing a separate two-process test-and-set for each 
node on the path up to the root. When one of these fails, it
would again descend, resetting all the tas-bits on which it
succeeded, and return 1. When it succeeds ascending up to
the root, it would return 0 and leave the resetting descent
to its n-process reset.

The intuition behind this tree approach is that if a pro- 
cess i fails the test-and-set at some node N , then another 
process j will get to the root successfully and thus justify 
the value 1 returned by the former. 

The worst case expected length of the n-process opera- 
tions is only log n (binary logarithm) times more than that 
of the two-process case. 

Unfortunately, this straightforward extension does not 
work. The problem is that the other process j need not be 
the one responsible for the failure at node N , and might 
have started its n-process test-and-set only after process i 
completes its own. Clearly, the resulting history cannot be 
linearized. 

Nonetheless, it turns out that with a somewhat more 
complicated construction we can deterministically imple- 
ment n-process test-and-set using two-process test-and-set 
as primitives |^. This shows that the impossibility of de- 
terministic wait-free atomic n-process test-and-set is com- 
pletely due to the impossibility of deterministic wait-free 
atomic two-process test-and-set. This latter problem we 
have just solved by a simple direct randomized algorithm. 